Stageless night
Stagefull night
Time asleep per date
Stage duration(h) per date
Average bedtime / waketime
Distribution of rem and deep sleep in a 10-day period
Distribution of rem and deep sleep for nights starting after midnight
Rem and Deep per bedtime: early vs late
Comparison Deepsleep per bedtime
Rem + Deep per light sleep and duration
Explaining correlation
Correlation of time_asleep and temperature
Correlation to weather: split into two dateranges
Does bedtime affect sleep - matrix?
Correlation between light and rem sleep
Correlation between bedtime and wake stage duration
Does temperature, sunrise/sunset affect sleep stages, sleep time, wake time, sleep duration - matrix?
Correlation between temperature and bedtime
Does Sunrise/ Sunset affect stages or bed-/waketimes?
Correlation sunset, bedtime
Not all of the results show up clearly in the data, but they often agree with scientific studies. Of course, the limited amount of data is not enough to support real claims. Other factors also affected the data, such as the end of my vacation in late summer.
Exploring sleep stages further:
Rem + deep sleep is roughly inversely related to light sleep (see Rem + Deep per light sleep and duration).
Correlation between light and rem sleep: -0.75; between light and deep sleep: only -0.57.
Maybe: you are woken up more often in the second part of the night, since it is louder; also, deep sleep won't let you wake up as easily as rem sleep.
Furthermore, the amount of rem sleep is also influenced by emotional state and events during the day, whereas physical fatigue does not vary as much.
Does bedtime affect sleep - matrix?
Nights: Classic(awake,restless,asleep):
Detailed (stages):
import pandas as pd
import altair as alt
import datetime
import os
alt.themes.enable("dark")
alt.data_transformers.disable_max_rows()
dfs = []
for filename in os.listdir('sleepdata'):
    # skip the sample files, load every other night
    if 'sample' in filename:
        continue
    df = pd.read_csv('sleepdata/'+filename)
    df['dt'] = pd.to_datetime(df['dt'])
    dfs.append(df)
dfs[0].head(5)
| Unnamed: 0 | dt | stage | type | efficiency | isMainSleep | dateOfSleep | startTime | endTime | duration | timeInBed | minutesAsleep | minutesAwake | minutesToFallAsleep | minutesAfterWakeup | weekday | hasNoSleepStages | sleep_type | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 2021-06-06 01:00:00 | restless | classic | 31 | True | 2021-06-06 | 2021-06-06T01:00:00.000 | 2021-06-06T09:50:30.000 | 31800000 | 530 | 155 | 339 | 0 | 36 | Sunday | True | nostages_main |
| 1 | 1 | 2021-06-06 01:00:30 | restless | classic | 31 | True | 2021-06-06 | 2021-06-06T01:00:00.000 | 2021-06-06T09:50:30.000 | 31800000 | 530 | 155 | 339 | 0 | 36 | Sunday | True | nostages_main |
| 2 | 2 | 2021-06-06 01:01:00 | restless | classic | 31 | True | 2021-06-06 | 2021-06-06T01:00:00.000 | 2021-06-06T09:50:30.000 | 31800000 | 530 | 155 | 339 | 0 | 36 | Sunday | True | nostages_main |
| 3 | 3 | 2021-06-06 01:01:30 | restless | classic | 31 | True | 2021-06-06 | 2021-06-06T01:00:00.000 | 2021-06-06T09:50:30.000 | 31800000 | 530 | 155 | 339 | 0 | 36 | Sunday | True | nostages_main |
| 4 | 4 | 2021-06-06 01:02:00 | restless | classic | 31 | True | 2021-06-06 | 2021-06-06T01:00:00.000 | 2021-06-06T09:50:30.000 | 31800000 | 530 | 155 | 339 | 0 | 36 | Sunday | True | nostages_main |
dfs[0]['stage'].unique()
array(['restless', 'awake', 'asleep'], dtype=object)
# classic ['restless', 'awake', 'asleep'], detailed ['wake', 'light', 'rem', 'deep']
detailed = [x for x in dfs if 'asleep' not in x['stage'].unique()]
classic = [x for x in dfs if 'asleep' in x['stage'].unique()]
[len(detailed),len(classic)]
#['#164AA6','#408DFF','#7EC4FF','#F43C6F']
#colors
class C:
    deep = '#164AA6'
    rem = '#408DFF'
    light = '#7EC4FF'
    wake = '#F43C6F'
alt.Chart(classic[0]).mark_square(size=15).encode(x=alt.X('dt'),y=alt.Y('stage',sort=['awake','restless','asleep']),\
color=alt.Color('stage',legend=None,scale=alt.Scale(range=['darkblue','lightblue','purple']))).properties(width=800)
alt.Chart(detailed[0]).mark_square(size=15).encode(x=alt.X('dt'),y=alt.Y('stage',sort=['wake','rem','light','deep']),\
color=alt.Color('stage',legend=None,scale=alt.Scale(range=[C.deep,C.light,C.rem,C.wake]))).properties(width=800)
#avg length of sleep
ldetailed = sum([len(df)/2 for df in detailed ]) / len(detailed)
lclassic = sum([len(df)/2 for df in classic ]) / len(classic)
print(ldetailed/60,lclassic/60)
9.018650793650794 8.720777777777778
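The `len(df)/2/60` conversions in this notebook rest on one assumption: every row is a 30-second sample, as the `dt` column in the preview suggests (01:00:00, 01:00:30, ...). A minimal sketch of that conversion, using a made-up list of stage labels and a hypothetical helper `stage_hours` that is not part of the notebook:

```python
from collections import Counter

SAMPLE_SECONDS = 30  # assumption: one row per 30 seconds, as in the dt column


def stage_hours(stages):
    """Return the hours spent in each stage for a list of per-sample labels."""
    counts = Counter(stages)
    return {stage: n * SAMPLE_SECONDS / 3600 for stage, n in counts.items()}


# Illustrative night (not recorded data): 2 h light, 1 h deep, 30 min rem, 30 min wake
night = ["light"] * 240 + ["deep"] * 120 + ["rem"] * 60 + ["wake"] * 60
hours = stage_hours(night)
total_hours = len(night) * SAMPLE_SECONDS / 3600
print(hours, total_hours)
```

If the tracker ever exported at a different sampling interval, only `SAMPLE_SECONDS` would need to change.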
# removing nights that are too short
# -> index 21 is an outlier
del detailed[21]
General
data = []
for df in dfs:
    # rows are 30-second samples: rows / 2 / 60 = hours
    data.append([df.dt[0],len(df)/2/60])
timeasleep_df = pd.DataFrame(data=data,columns=['dt','time_asleep'])
chart = alt.Chart(timeasleep_df,title='Time asleep').mark_line(color=C.light).encode(x='dt',y='time_asleep')
print('data includes period before sleep and after waking up')
timeasleep_chart = chart + chart.transform_regression('dt','time_asleep').mark_line(color='white')
timeasleep_chart
data includes period before sleep and after waking up
General:
data = []
for df in detailed:
    # rows are 30-second samples: rows / 2 / 60 = hours per stage
    wake = len(df.loc[df['stage']=='wake'])/2/60
    light = len(df.loc[df['stage']=='light'])/2/60
    rem = len(df.loc[df['stage']=='rem'])/2/60
    deep = len(df.loc[df['stage']=='deep'])/2/60
    l = [wake,rem,light,deep]
    duration = (df.dt.iloc[-1]-df.dt[0]).total_seconds() / 60 /60
    # one row per stage; the hours value must match its stage label
    data.append([df.dt[0],duration,wake,'wake']+l)
    data.append([df.dt[0],duration,light,'light']+l)
    data.append([df.dt[0],duration,rem,'rem']+l)
    data.append([df.dt[0],duration,deep,'deep']+l)
stagepercentage_general_df = pd.DataFrame(data= data,columns=['dt','duration','hours','stage','wake','rem','light','deep'])
stagepercentage_general_df['order'] = stagepercentage_general_df['stage'].replace(
{val: i for i, val in enumerate(['deep', 'light', 'rem', 'wake'])}
)
chart = alt.Chart(stagepercentage_general_df).mark_bar().encode(
x='dt',y=alt.Y('sum(hours)'),
color=alt.Color('stage'#,
# optional: make color order in legend match stack order
# sort=alt.EncodingSortField('order', order='descending')
,scale=alt.Scale(range=[C.deep,C.light,C.rem,C.wake])
),
order='order'
)
chart1 = alt.Chart(stagepercentage_general_df).mark_line(color = C.light).transform_regression('dt','light').encode(x='dt',y='light')
chart2 = alt.Chart(stagepercentage_general_df).mark_line(color = C.wake).transform_regression('dt','wake').encode(x='dt',y='wake')
chart3 = alt.Chart(stagepercentage_general_df).mark_line(color = C.deep).transform_regression('dt','deep').encode(x='dt',y='deep')
chart4 = alt.Chart(stagepercentage_general_df).mark_line(color = C.rem).transform_regression('dt','rem').encode(x='dt',y='rem')
chart + chart1 + chart2 + chart3 + chart4 \
| chart3.transform_regression('dt','deep',params=True).mark_text(align='left',color='white').encode(
x=alt.value(5), # pixels from left
y=alt.value(-4), # pixels from top
text='coef:N'
) + chart1.transform_regression('dt','light',params=True).mark_text(align='left',color='white').encode(
x=alt.value(5), # pixels from left
y=alt.value(25), # pixels from top
text='coef:N'
) + chart4.transform_regression('dt','rem',params=True).mark_text(align='left',color='white').encode(
x=alt.value(5), # pixels from left
y=alt.value(50), # pixels from top
text='coef:N'
) + chart2.transform_regression('dt','wake',params=True).mark_text(align='left',color='white').encode(
x=alt.value(5), # pixels from left
y=alt.value(75), # pixels from top
text='coef:N'
)
stagepercentage_general_df
| dt | duration | hours | stage | wake | rem | light | deep | order | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 2021-06-10 01:01:30 | 8.550000 | 1.308333 | wake | 1.308333 | 1.125000 | 5.041667 | 1.083333 | 3 |
| 1 | 2021-06-10 01:01:30 | 8.550000 | 5.041667 | light | 1.308333 | 1.125000 | 5.041667 | 1.083333 | 1 |
| 2 | 2021-06-10 01:01:30 | 8.550000 | 1.125000 | deep | 1.308333 | 1.125000 | 5.041667 | 1.083333 | 0 |
| 3 | 2021-06-10 01:01:30 | 8.550000 | 1.083333 | rem | 1.308333 | 1.125000 | 5.041667 | 1.083333 | 2 |
| 4 | 2021-06-17 00:35:00 | 9.041667 | 1.358333 | wake | 1.358333 | 1.550000 | 4.675000 | 1.466667 | 3 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 159 | 2021-10-18 22:15:00 | 8.633333 | 1.333333 | rem | 1.941667 | 1.133333 | 4.233333 | 1.333333 | 2 |
| 160 | 2021-10-19 22:59:30 | 9.550000 | 1.891667 | wake | 1.891667 | 1.125000 | 5.583333 | 0.958333 | 3 |
| 161 | 2021-10-19 22:59:30 | 9.550000 | 5.583333 | light | 1.891667 | 1.125000 | 5.583333 | 0.958333 | 1 |
| 162 | 2021-10-19 22:59:30 | 9.550000 | 1.125000 | deep | 1.891667 | 1.125000 | 5.583333 | 0.958333 | 0 |
| 163 | 2021-10-19 22:59:30 | 9.550000 | 0.958333 | rem | 1.891667 | 1.125000 | 5.583333 | 0.958333 | 2 |
164 rows × 9 columns
# get coeffs of different sleepstages: regression for the daterange
import numpy as np
from sklearn.linear_model import LinearRegression
import datetime
data = []
deeptimes, lighttimes, remtimes, waketimes = \
list(stagepercentage_general_df['deep']),\
list(stagepercentage_general_df['light']),\
list(stagepercentage_general_df['rem']),\
list(stagepercentage_general_df['wake'])
datetimes = [x.timetuple().tm_yday for x in stagepercentage_general_df['dt']]
#regression slope
x= np.array(datetimes).reshape((-1, 1))
y_deep= np.array(deeptimes)
y_rem = np.array(remtimes)
y_light = np.array(lighttimes)
y_wake = np.array(waketimes)
model_rem = LinearRegression().fit(x,y_rem)
model_light = LinearRegression().fit(x,y_light)
model_deep = LinearRegression().fit(x,y_deep)
model_wake = LinearRegression().fit(x,y_wake)
# printing the coefficients and the difference in hours of the y axis
print(f'Days {x[-1]-x[0]}\n Wake coef {model_wake.coef_}, dif {131*model_wake.coef_[0]} \n deep coef {model_deep.coef_} dif {131*model_deep.coef_[0]} \n rem coef {model_rem.coef_} dif {131*model_rem.coef_[0]}\n light coef {model_light.coef_} dif {131*model_light.coef_[0]}')
print(f'total dif {0.4109248806940455+0.12642730221843085+0.13530666341671496-0.18342549288113877} h')
Days [131] Wake coef [0.00313683], dif 0.4109248806940455 deep coef [0.00096509] dif 0.12642730221843085 rem coef [0.00103288] dif 0.13530666341671496 light coef [-0.00140019] dif -0.18342549288113877 total dif 0.4892333534480525 h
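The `dif` values above are the fitted slope (hours per day) multiplied by the span of the data in days. A self-contained sketch of that arithmetic with an ordinary least-squares slope; `ols_slope` is a hypothetical helper, the day numbers correspond to 10 June and 19 October, and the wake durations are made up for illustration:

```python
def ols_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Day of year for each night and illustrative hours awake during that night
days = [161, 168, 200, 250, 292]
wake_hours = [1.3, 1.35, 1.45, 1.6, 1.75]

slope = ols_slope(days, wake_hours)  # hours per day
span = days[-1] - days[0]            # days covered by the data
print(f"slope {slope:.5f} h/day, change over {span} days: {slope * span:.2f} h")
```

Deriving `span` from the data rather than hard-coding it keeps the product correct if nights are added or removed.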
import numpy as np
from sklearn.linear_model import LinearRegression
import datetime
data = []
bedtimes= []
waketimes = []
datetimes = []
for df in dfs:
    time = df.dt.iloc[0].time()
    bed = datetime.datetime(2020,1,1,time.hour,time.minute,time.second)
    # bedtimes after midnight belong to the next day and are counted as 24+ hours
    if time.hour<7:
        bed = datetime.datetime(2020,1,2,time.hour,time.minute,time.second)
        bedtimes.append(time.hour+time.minute/60+time.second/60/60 +24)
    else:
        bedtimes.append(time.hour+time.minute/60+time.second/60/60)
    time2= df.dt.iloc[-1]
    waketimes.append(time2.hour+time2.minute/60+time2.second/60/60)
    datetimes.append(df.dt.iloc[0].timetuple().tm_yday)
    wake = datetime.datetime(2020,1,2,time2.hour,time2.minute,time2.second)
    data.append([df.dt[0],bed,wake])
#regression slope
x= np.array(datetimes).reshape((-1, 1))
y_bed = np.array(bedtimes)
y_wake = np.array(waketimes)
model_bed = LinearRegression().fit(x,y_bed)
model_wake = LinearRegression().fit(x,y_wake)
print(f'avg bedtime {sum(y_bed)/len(y_bed)}, avg wake {sum(y_wake)/len(y_wake)}')
# print(model_bed.coef_,y_bed,x)
print(f'Days {x[-1]-x[0]} Wake coef {model_wake.coef_}, dif {135*model_wake.coef_[0]} Bed coef {model_bed.coef_} dif {135*model_bed.coef_[0]}')
print(f'total {abs(-1.9989724973080545 + 1.625821326285584) }')
avg_bed_wake_time_df = pd.DataFrame(data=data,columns=['dt','bed','wake'])
chart1 = alt.Chart(avg_bed_wake_time_df).mark_line(color=C.light).encode(x='dt',y='bed')
chart1_reg = chart1.mark_line(color=C.light).transform_regression('dt','bed')
chart2 = alt.Chart(avg_bed_wake_time_df).mark_line(color=C.wake).encode(x='dt',y='wake')
chart2_reg = chart2.mark_line(color=C.wake).transform_regression('dt','wake')
chart1 + chart1_reg + chart2 + chart2_reg
# )+ \
# + chart1_reg.transform_regression('dt','bed',params=True).mark_text(align='left',color='white').encode(
# x=alt.value(5), # pixels from left
# y=alt.value(100), # pixels from top
# text='coef:N'
#+ chart2_reg.transform_regression('dt','wake',params=True).mark_text(align='left',color='white').encode(
# x=alt.value(5), # pixels from left
# y=alt.value(-4), # pixels from top
# text='coef:N'
# )
avg bedtime 23.627777777777776, avg wake 8.447150997150999 Days [135] Wake coef [-0.01204312], dif -1.625821326285584 Bed coef [-0.0148072] dif -1.9989724973080545 total 0.3731511710224704
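Bedtimes after midnight are shifted by 24 hours before averaging; otherwise 23:00 and 01:00 would average to noon instead of midnight. A small sketch of that idea, assuming the cutoff of 7 used in the cell above; `bedtime_hours` and `cutoff_hour` are illustrative names:

```python
import datetime


def bedtime_hours(t, cutoff_hour=7):
    """Clock time as decimal hours, pushed past 24 if it falls after midnight."""
    hours = t.hour + t.minute / 60 + t.second / 3600
    return hours + 24 if t.hour < cutoff_hour else hours


bedtimes = [datetime.time(23, 0), datetime.time(1, 0)]

# Naive average of clock hours versus average of shifted hours
naive = sum(t.hour + t.minute / 60 for t in bedtimes) / len(bedtimes)
shifted = sum(bedtime_hours(t) for t in bedtimes) / len(bedtimes)
print(naive, shifted)
```

An average above 24 on the shifted scale simply means the mean bedtime falls after midnight; subtracting 24 gives the clock time.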
Comparing the total change implied by the regressions for the sleep stages with the one for bedtime and wake time, the two are in a similar range:
total dif 0.4892333534480525 h
total 0.3731511710224704
chart = None
for df in detailed[:10]:
    # put all nights on one reference date: samples before 18:00 go to Jan 2, later ones to Jan 1,
    # so that 22, 23, 0, 1, 2, 3 o'clock stay in order
    df['time'] = df['dt'].apply(lambda x: x.replace(month=1,day=2) if x.hour<18 else x.replace(month=1,day=1) )
    if chart is None:
        chart = alt.Chart(df).mark_square(size=20,opacity=0.2).encode(x=alt.X('time'),y=alt.Y('stage',sort=['wake','rem','light','deep']),\
        color=alt.Color('stage',legend=None,scale=alt.Scale(range=[C.deep,C.light,C.rem,C.wake]))).properties(width=1200)
    else:
        chart+= alt.Chart(df).mark_square(size=20,opacity=0.2).encode(x=alt.X('time'),y=alt.Y('stage',sort=['wake','rem','light','deep']),\
        color=alt.Color('stage',legend=None,scale=alt.Scale(range=[C.deep,C.light,C.rem,C.wake]))).properties(width=1200)
chart
# nights starting after midnight (first sample before 05:00)
chart = None
for df in detailed:
    if df['dt'].dt.hour[0] >= 5:
        continue
    # map all samples onto the same reference day (Jan 1)
    df['time'] = df['dt'].apply(lambda x: x.replace(month=1,day=1))
    if chart is None:
        chart = alt.Chart(df).mark_square(size=20,opacity=0.2).encode(x=alt.X('time'),y=alt.Y('stage',sort=['wake','rem','light','deep']),\
        color=alt.Color('stage',legend=None,scale=alt.Scale(range=[C.deep,C.light,C.rem,C.wake]))).properties(width=1200)
    else:
        chart+= alt.Chart(df).mark_square(size=20,opacity=0.2).encode(x=alt.X('time'),y=alt.Y('stage',sort=['wake','rem','light','deep']),\
        color=alt.Color('stage',legend=None,scale=alt.Scale(range=[C.deep,C.light,C.rem,C.wake]))).properties(width=1200)
chart
gobed_early = stagepercentage_general_df.loc[stagepercentage_general_df['dt'].dt.hour>=5]
gobed_late = stagepercentage_general_df.loc[stagepercentage_general_df['dt'].dt.hour<5]
print('late deep-sleep:',gobed_late['deep'].mean(),'rem',gobed_late['rem'].mean())
print('early deep:',gobed_early['deep'].mean(),'rem:',gobed_early['rem'].mean())
data = []
data.append(['late',gobed_late['deep'].mean(),gobed_late['rem'].mean()])
data.append(['early',gobed_early['deep'].mean(),gobed_early['rem'].mean()])
df = pd.DataFrame(data=data,columns=['time','deep','rem'])
alt.Chart(df).mark_bar().encode(x='time',y='deep') + alt.Chart(df).mark_bar().encode(x='time',y='rem')
late deep-sleep: 1.2272727272727268 rem 1.2984848484848486 early deep: 1.2469444444444444 rem: 1.2013888888888888
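The early/late split groups nights by the hour of their first sample: a start before 05:00 counts as "late", anything else as "early". The sketch below reproduces that grouping on a handful of made-up nights; `is_late`, the cutoff parameter and the numbers are illustrative assumptions, not the recorded data:

```python
from statistics import mean

# Hypothetical nights: (hour the night started, deep hours, rem hours)
nights = [(23, 1.3, 1.1), (22, 1.2, 1.3), (1, 1.1, 1.4), (0, 1.3, 1.2)]


def is_late(start_hour, cutoff=5):
    """A night counts as 'late' when it starts after midnight but before the cutoff."""
    return start_hour < cutoff


groups = {"early": [], "late": []}
for start_hour, deep, rem in nights:
    groups["late" if is_late(start_hour) else "early"].append((deep, rem))

# Mean deep and rem hours per group
summary = {
    name: {"deep": mean(d for d, _ in rows), "rem": mean(r for _, r in rows)}
    for name, rows in groups.items()
}
print(summary)
```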
# didnt sleep much = 2021-10-03
# alcohol seems to increase my rem and deepsleep?
alt.Chart(detailed[29],title='alcohol').mark_square(size=15).encode(x=alt.X('dt'),y=alt.Y('stage',sort=['wake','rem','light','deep']),\
color=alt.Color('stage',legend=None,scale=alt.Scale(range=[C.deep,C.light,C.rem,C.wake]))).properties(width=600)
# 2021-08-12 not much deep sleep
alt.Chart(detailed[10],title='not much deep sleep').mark_square(size=15).encode(x=alt.X('dt'),y=alt.Y('stage',sort=['wake','rem','light','deep']),\
color=alt.Color('stage',legend=None,scale=alt.Scale(range=[C.deep,C.light,C.rem,C.wake]))).properties(width=900)
alt.Chart(detailed[34],title='normal sleep').mark_square(size=15).encode(x=alt.X('dt'),y=alt.Y('stage',sort=['wake','rem','light','deep']),\
color=alt.Color('stage',legend=None,scale=alt.Scale(range=[C.deep,C.light,C.rem,C.wake]))).properties(width=900)
df= stagepercentage_general_df.copy()
df = df.loc[df['duration']>6.850000]
df['rem and deep']= df['rem'] + df['deep']
alt.Chart(df).mark_line(color=C.light).encode(x='light',y='rem and deep') | alt.Chart(df).mark_line(color=C.light).encode(x='duration',y='rem and deep')
weather_df = pd.read_csv('Wetter_Speyer_31Okt_2020_2021_Einfach.csv',sep=',')
weather_df['dt'] = weather_df.Tag.apply(lambda x: x.split('.')[2]+'-'+x.split('.')[1]+'-'+x.split('.')[0]+' ') +weather_df.Stunde
weather_df['dt'] = pd.to_datetime(weather_df['dt'])
weather_df.head(5)
| Unnamed: 0 | Unnamed: 0.1 | Tag | Stunde | avg_rel_Humidpercent | avg_temp20cm | avg_temp200cm | avg_boden_temp5cm | avg_boden_temp20cm | avg_wind_velocity2_5meter | sum_glob_strahlung_Wh/m2 | niederschlag_nn | dt | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 31.10.2020 | 00:00 | 83.0 | 12.5 | 13.2 | 12.4 | 12.7 | 0.8 | 0.0 | 0.0 | 2020-10-31 00:00:00 |
| 1 | 1 | 1 | 31.10.2020 | 01:00 | 82.9 | 12.5 | 13.1 | 12.3 | 12.7 | 0.9 | 0.0 | 0.0 | 2020-10-31 01:00:00 |
| 2 | 2 | 2 | 31.10.2020 | 02:00 | 83.4 | 12.2 | 12.9 | 12.2 | 12.7 | 0.8 | 0.0 | 0.0 | 2020-10-31 02:00:00 |
| 3 | 3 | 3 | 31.10.2020 | 03:00 | 83.7 | 11.9 | 12.8 | 12.2 | 12.6 | 0.5 | 0.0 | 0.0 | 2020-10-31 03:00:00 |
| 4 | 4 | 4 | 31.10.2020 | 04:00 | 84.5 | 11.6 | 12.6 | 12.2 | 12.6 | 0.4 | 0.0 | 0.0 | 2020-10-31 04:00:00 |
weather_df_selected = weather_df.loc[(weather_df['dt']>datetime.datetime(2021,6,5)) & (weather_df['dt']<datetime.datetime(2021,10,21))]
c11 = alt.Chart(weather_df_selected).mark_square(size=15).encode(x='dt',y='avg_temp200cm',color=alt.Color('avg_temp200cm',scale=alt.Scale(range=['lightyellow','yellow','red']),title='Temperature(°C)'))
c12 = alt.Chart(weather_df_selected).mark_bar().encode(x='dt',y='niederschlag_nn',color=alt.Color('niederschlag_nn',scale=alt.Scale(range=['dodgerblue','darkblue']),title='Rainfall(mm)'))
c1 = alt.layer(
c11,
c12
).resolve_scale(color='independent')
c2= alt.Chart(weather_df_selected).mark_line(color='green').encode(x='dt',y='avg_wind_velocity2_5meter')
c3 = alt.Chart(weather_df_selected).mark_bar(color='yellow').encode(x='dt',y='sum_glob_strahlung_Wh/m2') \
| alt.Chart(weather_df_selected).mark_line(color='purple').encode(x='dt',y='avg_rel_Humidpercent')
alt.vconcat(alt.hconcat(c1,c2),c3)
# need data from 2021-06-06 till 2021-10-20
weather_df_selected = weather_df.loc[(weather_df['dt']>datetime.datetime(2021,6,5)) & (weather_df['dt']<datetime.datetime(2021,10,21))]
weather_df_selected = weather_df_selected.loc[(weather_df_selected['dt'].dt.time>datetime.time(0)) & (weather_df_selected['dt'].dt.time<datetime.time(3))]
c11 = alt.Chart(weather_df_selected).mark_line(color='Yellow').encode(x='dt',y='avg_temp200cm')
c12 = alt.Chart(weather_df_selected).mark_bar().encode(x='dt',y='niederschlag_nn',color=alt.Color('niederschlag_nn',scale=alt.Scale(range=['dodgerblue','darkblue']),title='Rainfall(mm)'))
c1 = alt.layer(
c11,
c12
).resolve_scale(color='independent')
c2= alt.Chart(weather_df_selected).mark_line(color='green').encode(x='dt',y='avg_wind_velocity2_5meter')
c3 = alt.Chart(weather_df_selected).mark_bar(color='yellow').encode(x='dt',y='sum_glob_strahlung_Wh/m2') \
| alt.Chart(weather_df_selected).mark_line(color='purple').encode(x='dt',y='avg_rel_Humidpercent')
alt.vconcat(alt.hconcat(c1,c2),c3)
tempdf1 = weather_df.loc[weather_df['Tag'] =='01.01.2021']
tempdf2 = weather_df.loc[weather_df['Tag'] =='01.03.2021']
tempdf3 = weather_df.loc[weather_df['Tag'] =='01.06.2021']
tempdf4 = weather_df.loc[weather_df['Tag'] =='01.09.2021']
totaldf = pd.concat([tempdf1,tempdf2,tempdf3,tempdf4])
chart = alt.Chart(totaldf,title='Sun radiation').mark_line().encode(x='Stunde',y='sum_glob_strahlung_Wh/m2',color=alt.Color('Tag',scale=alt.Scale(range=['darkred','red','yellow','orange'])))
chart
data = []
for name, df in weather_df.groupby('Tag'):
    # first and last hour of the day with non-zero global radiation, used as sunrise and sunset
    entry = df.loc[df['sum_glob_strahlung_Wh/m2'] != 0].iloc[0]
    entry2 = df.loc[df['sum_glob_strahlung_Wh/m2'] != 0].iloc[-1]
    data.append([datetime.datetime.strptime(str(entry['dt']),'%Y-%m-%d %H:%M:%S'),datetime.datetime.strptime(str(entry['dt']),'%Y-%m-%d %H:%M:%S').date(),int(entry['Stunde'].replace(':00','')),int(entry2['Stunde'].replace(':00',''))])
df_sunrise_set = pd.DataFrame(data=data,columns=['dt','date','sunrise','sunset'])
alt.Chart(df_sunrise_set[['dt','sunrise']],title='Sun rise').mark_line(color='orange').encode(x='dt',y='sunrise')
weather_df_selected = weather_df.loc[(weather_df['dt']>datetime.datetime(2021,6,5)) & (weather_df['dt']<datetime.datetime(2021,10,21))]
weather_df_selected = weather_df_selected.loc[(weather_df_selected['dt'].dt.time==datetime.time(21)) | (weather_df_selected['dt'].dt.time==datetime.time(1))| (weather_df_selected['dt'].dt.time==datetime.time(4))]
temperature_chart = alt.Chart(weather_df_selected[weather_df_selected.index % 5 != 0],title='Temperature at 9pm, 1am, 4am').mark_line(color='Yellow').encode(x='dt',y='avg_temp200cm',color=alt.Color('Stunde',scale=alt.Scale(range=['lightgreen','limegreen','gainsboro'])))
temperature_chart
[len(weather_df_selected),len(timeasleep_df)]
[414, 117]
from math import sqrt
def get_corr(temp,sleep):
    # Pearson correlation coefficient of two pandas Series
    n = len(temp)
    numerator = n * sum(temp*sleep) - sum(temp)*sum(sleep)
    denominator = sqrt((n * sum(temp**2) - sum(temp)**2)) * sqrt((n * sum(sleep**2) - sum(sleep)**2) )
    return numerator / denominator
x = pd.Series(data=[3,5,6.5,4.5])
y = pd.Series(data=[1,1.4,1.3,1])
c1 = alt.Chart(pd.DataFrame({'x':x,'y':y})).mark_point(color=C.wake,size=50).encode(x=alt.X('x',scale=alt.Scale(domain=[2,7])),y=alt.Y('y',scale=alt.Scale(domain=[0.8,1.6])))
reg1 = c1 + c1.mark_line(color=C.light).transform_regression('x','y').interactive()
print('c = -1 c =',get_corr(x,y))
x = pd.Series(data=[0.5,1.5,2.5,3.5])
y = pd.Series(data=[2,1.5,1,0.5])
c1 = alt.Chart(pd.DataFrame({'x':x,'y':y})).mark_point(color=C.wake,size=50).encode(x=alt.X('x',scale=alt.Scale(domain=[0,4])),y=alt.Y('y',scale=alt.Scale(domain=[0,3])))
reg2 = c1 + c1.mark_line(color=C.light).transform_regression('x','y').interactive()
reg2 | reg1
c = -1 c = 0.7001400420140103
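The printed line labels the second, perfectly linear data set as c = -1 and evaluates `get_corr` only on the first, scattered one. Both values can be checked without pandas; this sketch applies the same formula to plain lists (`pearson` is a stand-in name, not a function from the notebook):

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation, same formula as get_corr but for plain lists."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum(xs) * sum(ys)
    denominator = sqrt(n * sum(x * x for x in xs) - sum(xs) ** 2) * \
        sqrt(n * sum(y * y for y in ys) - sum(ys) ** 2)
    return numerator / denominator


# The two toy data sets plotted above
r_scattered = pearson([3, 5, 6.5, 4.5], [1, 1.4, 1.3, 1])
r_linear = pearson([0.5, 1.5, 2.5, 3.5], [2, 1.5, 1, 0.5])
print(r_scattered, r_linear)
```

The linear set lies exactly on y = 2.25 - 0.5x, which is why its coefficient comes out at -1 up to floating-point rounding.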
#time asleep adjusted to scale
data1 = []
for df in dfs:
    date = df['dt'].dt.date[0]
    # hours in bed, rescaled (cubed and stretched) so the curve is comparable to the temperature curve
    data1.append([datetime.datetime(date.year,date.month,date.day),(len(df)/ 2 / 60 / 8.5)**3 * 15 ])
timeasleep_corr_df = pd.DataFrame(data=data1,columns=['dt','time_asleep'])
timeasleep_corr_chart = alt.Chart(timeasleep_corr_df,title='Correlation: Time asleep(adjusted) - Temperature').mark_line(color=C.light).encode(x='dt',y='time_asleep')
timeasleep_corr_chart_reg = timeasleep_corr_chart.mark_line(color='darkblue').transform_regression('dt','time_asleep',method='poly')
timeasleep_corr_chart += timeasleep_corr_chart_reg
weather_df_selected = weather_df.loc[(weather_df['dt']>datetime.datetime(2021,6,5)) & (weather_df['dt']<datetime.datetime(2021,10,21))].copy()
# only take entries for which the days exist
# .copy() above avoids a SettingWithCopyWarning when the date column is added
weather_df_selected['date'] = weather_df_selected['dt'].dt.normalize()
print(len(weather_df_selected))
weather_df_selected = weather_df_selected.loc[weather_df_selected['date'].isin(timeasleep_corr_df['dt'])]
print(len(weather_df_selected))
weather_df_selected.reset_index(inplace=True)
# only take entries for which the days exist end
weather_df_selected = weather_df_selected.loc[( (weather_df_selected['dt'].dt.time==datetime.time(22)))]
temperature_chart = alt.Chart(weather_df_selected[weather_df_selected.index % 5 != 0],title='Temperature throughout the night').mark_line(color='Yellow').encode(x='date',y='avg_temp200cm')
temperature_chart_reg = temperature_chart.mark_line(color='gold').transform_regression('date','avg_temp200cm',method='poly')
temperature_chart += temperature_chart_reg
timeasleep_corr_chart + temperature_chart.interactive()
3311 2568
#coefficients wrong
#result: both are in similar ranges
light= -1.945e-11
wake= 3.561e-11
deep= 1.107e-11
rem= 1.119e-11
waketime= -0.000504
bedtime= - 0.000622
#delta(waketime-bedtime) = |light + wake + deep + rem|
print(waketime-bedtime)
print(light+wake+deep+rem)
0.00011800000000000005 3.842e-11
#try changing anticorrelation to correlation
data1 = []
for df in dfs:
    date = df['dt'].dt.date[0]
    data1.append([datetime.datetime(date.year,date.month,date.day),(len(df)/ 2 / 60 / 8.5)**3 * 15 ])
timeasleep_corr_df = pd.DataFrame(data=data1,columns=['dt','time_asleep'])
timeasleep_corr_df['time_asleep'] = timeasleep_corr_df['time_asleep'].mean() - (timeasleep_corr_df['time_asleep']-timeasleep_corr_df['time_asleep'].mean())
timeasleep_corr_chart = alt.Chart(timeasleep_corr_df,title='Correlation: Time asleep(adj,anticorr) - Temperature').mark_line(color=C.light).encode(x='dt',y='time_asleep')
timeasleep_corr_chart_reg = timeasleep_corr_chart.mark_line(color='darkblue').transform_regression('dt','time_asleep',method='poly')
timeasleep_corr_chart += timeasleep_corr_chart_reg
weather_df_selected = weather_df.loc[(weather_df['dt']>datetime.datetime(2021,6,5)) & (weather_df['dt']<datetime.datetime(2021,10,21))].copy()
# only take entries for which the days exist
# .copy() above avoids a SettingWithCopyWarning when the date column is added
weather_df_selected['date'] = weather_df_selected['dt'].dt.normalize()
print(len(weather_df_selected))
weather_df_selected = weather_df_selected.loc[weather_df_selected['date'].isin(timeasleep_corr_df['dt'])]
print(len(weather_df_selected))
weather_df_selected.reset_index(inplace=True)
# only take entries for which the days exist end
weather_df_selected = weather_df_selected.loc[( (weather_df_selected['dt'].dt.time==datetime.time(22)))]
temperature_chart = alt.Chart(weather_df_selected[weather_df_selected.index % 5 != 0],title='Temperature throughout the night').mark_line(color='Yellow').encode(x='date',y='avg_temp200cm')
temperature_chart_reg = temperature_chart.mark_line(color='gold').transform_regression('date','avg_temp200cm',method='poly')
temperature_chart += temperature_chart_reg
timeasleep_corr_chart + temperature_chart.interactive()
3311 2568
import matplotlib
#Getting Correlation
#https://en.wikipedia.org/wiki/Correlation
timeasleep_df['dateOnly']=pd.to_datetime(timeasleep_df['dt'].dt.date)
corr_df = weather_df_selected.merge(right=timeasleep_df, left_on='date',right_on='dateOnly',how='left')
#removing outliers
corr_df = corr_df.loc[corr_df['time_asleep']>4]
corr_df_1 = corr_df.loc[corr_df['dt_x']<datetime.datetime(2021,9,12)]
corr_df_2 = corr_df.loc[corr_df['dt_x']>=datetime.datetime(2021,9,12)]
l = list(corr_df.columns)[3:]
l.remove('sum_glob_strahlung_Wh/m2')
corr1 =corr_df_1[l].corr()['time_asleep']
corr2 =corr_df_2[l].corr()['time_asleep']
data1 = []
data2 = []
for x,v in corr1.items():
    if x == 'time_asleep':
        continue
    data1.append([x,v])
for x,v in corr2.items():
    if x == 'time_asleep':
        continue
    data2.append([x,v])
df1 = pd.DataFrame(data=data1,columns=['y','x'])
df2 = pd.DataFrame(data=data2,columns=['y','x'])
c1 = alt.Chart(df1,title='Correlation to "time asleep"').mark_rect().encode(
x='x',
y='y',
color =alt.Color('x',scale=alt.Scale(domain=[-1,1],range=['red','green']))
)
c2 = alt.Chart(df2,title='Correlation to "time asleep"').mark_rect().encode(
x='x',
y='y',
color =alt.Color('x',scale=alt.Scale(domain=[-1,1],range=['red','green','green']))
)
corr_matrix = c1 | c2
print('First',df1 )
print('Second',df2 )
c1 = alt.Chart(corr_df_1,title='till Sept. 12th').mark_point(size=15,color=C.light).encode(x='avg_temp20cm',y='time_asleep')
c2 = c1.mark_line().transform_regression('avg_temp20cm','time_asleep')
c_first = c1 + c2
c1 = alt.Chart(corr_df_2,title='starting Sept. 12th').mark_point(size=15,color=C.light).encode(x='avg_temp20cm',y='time_asleep')
c2 = c1.mark_line().transform_regression('avg_temp20cm','time_asleep')
c_second = c1 + c2
alt.vconcat( corr_matrix , c_first | c_second )
First y x 0 avg_rel_Humidpercent 0.156554 1 avg_temp20cm -0.237621 2 avg_temp200cm -0.237376 3 avg_boden_temp5cm -0.250240 4 avg_boden_temp20cm -0.267134 5 avg_wind_velocity2_5meter -0.155712 6 niederschlag_nn -0.112680 Second y x 0 avg_rel_Humidpercent -0.008040 1 avg_temp20cm -0.044888 2 avg_temp200cm -0.034701 3 avg_boden_temp5cm -0.076391 4 avg_boden_temp20cm -0.066906 5 avg_wind_velocity2_5meter 0.042423 6 niederschlag_nn 0.078354
# compute the correlation between temperature and time_asleep by hand, to understand the coefficient
# works; this value covers the whole period
get_corr(corr_df['avg_temp20cm'],corr_df['time_asleep'])
-0.18768468424319587
data1 = []
data2 = []
# define sleep by when we go to bed
# or by when we start to sleep
for df in detailed:
    startTime = df.iloc[0]['startTime']
    endTime = df.iloc[-1]['endTime']
    startTime= datetime.datetime.strptime(startTime,'%Y-%m-%dT%H:%M:%S.%f')
    endTime= datetime.datetime.strptime(endTime,'%Y-%m-%dT%H:%M:%S.%f')
    # timestamp (column 1, 'dt') of the first 'light' sample, taken as the moment of falling asleep
    startTimeSleep = df.loc[df['stage']=='light'].values[0][1]
    #endTimeSleep = df.loc[df['stage']=='light'].values[-1][1]
    startTimeSleep= datetime.datetime.strptime(str(startTimeSleep),'%Y-%m-%d %H:%M:%S')
    #endTimeSleep= datetime.datetime.strptime(str(endTimeSleep),'%Y-%m-%d %H:%M:%S')
    # clock times as decimal hours
    startTimeHours = startTime.hour + startTime.minute / 60 + startTime.second / 60 / 60
    endTimeHours = endTime.hour + endTime.minute / 60 + endTime.second / 60 / 60
    startTimeSleepHours = startTimeSleep.hour + startTimeSleep.minute / 60 + startTimeSleep.second / 60 /60
    # add 24 to times after midnight so that 23, 24, 25, 26 ... stay in order
    if startTime.hour < 6:
        startTimeHours += 24
    if endTime.hour < 6:
        endTimeHours += 24
    if startTimeSleep.hour < 6:
        startTimeSleepHours += 24
    data1.append([startTime.date(),startTime.time(),startTimeSleep.time(),startTimeHours,startTimeSleepHours,endTimeHours])
startSleepTime_df = pd.DataFrame(data=data1,columns=['date','startTimeWake','startTimeSleep','startTimeHours','startTimeSleepHours','endTimeHours'])
stagepercentage_general_df['date'] = stagepercentage_general_df['dt'].dt.date
stage_p_and_startTime_df = stagepercentage_general_df.merge(startSleepTime_df,how='left',on='date')
stage_p_and_startTime_df.head(5)
| dt | duration | hours | stage | wake | rem | light | deep | order | date | startTimeWake | startTimeSleep | startTimeHours | startTimeSleepHours | endTimeHours | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2021-06-10 01:01:30 | 8.550000 | 1.308333 | wake | 1.308333 | 1.125 | 5.041667 | 1.083333 | 3 | 2021-06-10 | 01:01:30 | 01:11:00 | 25.025000 | 25.183333 | 9.575 |
| 1 | 2021-06-10 01:01:30 | 8.550000 | 5.041667 | light | 1.308333 | 1.125 | 5.041667 | 1.083333 | 1 | 2021-06-10 | 01:01:30 | 01:11:00 | 25.025000 | 25.183333 | 9.575 |
| 2 | 2021-06-10 01:01:30 | 8.550000 | 1.125000 | deep | 1.308333 | 1.125 | 5.041667 | 1.083333 | 0 | 2021-06-10 | 01:01:30 | 01:11:00 | 25.025000 | 25.183333 | 9.575 |
| 3 | 2021-06-10 01:01:30 | 8.550000 | 1.083333 | rem | 1.308333 | 1.125 | 5.041667 | 1.083333 | 2 | 2021-06-10 | 01:01:30 | 01:11:00 | 25.025000 | 25.183333 | 9.575 |
| 4 | 2021-06-17 00:35:00 | 9.041667 | 1.358333 | wake | 1.358333 | 1.550 | 4.675000 | 1.466667 | 3 | 2021-06-17 | 00:35:00 | 00:42:00 | 24.583333 | 24.700000 | 9.625 |
stage_p_and_startTime_df.corr().style.background_gradient(cmap='coolwarm')
| duration | hours | wake | rem | light | deep | order | startTimeHours | startTimeSleepHours | endTimeHours | |
|---|---|---|---|---|---|---|---|---|---|---|
| duration | 1.000000 | 0.101417 | 0.638256 | -0.138149 | 0.595651 | 0.134661 | -0.000000 | -0.464541 | -0.447940 | 0.125227 |
| hours | 0.101417 | 1.000000 | 0.064730 | -0.014011 | 0.060409 | 0.013657 | -0.133312 | -0.047113 | -0.045429 | 0.012700 |
| wake | 0.638256 | 0.064730 | 1.000000 | -0.370498 | 0.396465 | -0.082317 | 0.000000 | -0.610676 | -0.581844 | -0.274740 |
| rem | -0.138149 | -0.014011 | -0.370498 | 1.000000 | -0.752701 | 0.515303 | 0.000000 | 0.122070 | 0.102455 | 0.084231 |
| light | 0.595651 | 0.060409 | 0.396465 | -0.752701 | 1.000000 | -0.574646 | -0.000000 | -0.215444 | -0.203396 | 0.122912 |
| deep | 0.134661 | 0.013657 | -0.082317 | 0.515303 | -0.574646 | 1.000000 | 0.000000 | -0.017965 | -0.013591 | 0.069056 |
| order | -0.000000 | -0.133312 | 0.000000 | 0.000000 | -0.000000 | 0.000000 | 1.000000 | -0.000000 | -0.000000 | -0.000000 |
| startTimeHours | -0.464541 | -0.047113 | -0.610676 | 0.122070 | -0.215444 | -0.017965 | -0.000000 | 1.000000 | 0.985307 | 0.799889 |
| startTimeSleepHours | -0.447940 | -0.045429 | -0.581844 | 0.102455 | -0.203396 | -0.013591 | -0.000000 | 0.985307 | 1.000000 | 0.791977 |
| endTimeHours | 0.125227 | 0.012700 | -0.274740 | 0.084231 | 0.122912 | 0.069056 | -0.000000 | 0.799889 | 0.791977 | 1.000000 |
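One thing to keep in mind when reading this matrix: the frame is in long format, with four rows per night (one per stage), so the per-night columns such as `rem`, `light` or `startTimeHours` are repeated four times. That repetition does not change Pearson's r, but it makes the `hours` and `order` rows hard to interpret and overstates the number of observations. The sketch below uses a small made-up frame (the values are hypothetical, not taken from my data) to show collapsing to one row per night with `drop_duplicates`.

```python
import pandas as pd

# Hypothetical miniature of the long-format frame: one row per (night, stage).
long_df = pd.DataFrame({
    "date": ["2021-06-10"] * 4 + ["2021-06-17"] * 4 + ["2021-06-18"] * 4,
    "stage": ["wake", "light", "deep", "rem"] * 3,
    "rem": [1.1] * 4 + [1.5] * 4 + [1.3] * 4,
    "light": [5.0] * 4 + [4.7] * 4 + [4.9] * 4,
})

# Keep a single row per night; the per-night columns are identical
# across the four stage rows, so nothing is lost for these columns.
per_night = long_df.drop_duplicates(subset="date")

# Pearson's r is the same on both frames, only the row count differs.
r_long = long_df[["rem", "light"]].corr().loc["rem", "light"]
r_night = per_night[["rem", "light"]].corr().loc["rem", "light"]
```

The per-night frame is the one whose length should be used whenever a sample size is needed, for example for a confidence interval.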
print(get_corr(stage_p_and_startTime_df['light'],stage_p_and_startTime_df['rem']))
c1 = alt.Chart(stage_p_and_startTime_df[['light','rem']],title='Correlation between light and rem sleep').mark_point(size=15,color=C.light).encode(x='light',y='rem')
c1 + c1.mark_line().transform_regression('light','rem')
-0.752701072915699
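The printed value matches the `light`/`rem` entry of the matrix, so `get_corr` appears to return Pearson's r. Since the number of nights is small, it may help to attach a rough confidence interval to such a value. The following is a dependency-free sketch using the Fisher z-transformation; `pearson_r` and `fisher_ci` are hypothetical helper names of mine, not the notebook's `get_corr`, and `n` should be the number of nights, not the number of long-format rows.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for r via the Fisher z-transform.

    Requires n > 3 and -1 < r < 1.
    """
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Example with a hypothetical sample size of 30 nights.
low, high = fisher_ci(-0.75, 30)
```

With only a few dozen nights the interval is wide, which fits the caveat that the data is too limited for strong claims.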
print(get_corr(stage_p_and_startTime_df['startTimeSleepHours'],stage_p_and_startTime_df['wake']))
c1 = alt.Chart(stage_p_and_startTime_df[['startTimeSleepHours','wake']],title='Correlation between startTimeSleepHours and wake duration').mark_point(size=15,color=C.light).encode(x=alt.X('startTimeSleepHours',scale=alt.Scale(domain=[21,28])),y='wake')
c1 + c1.mark_line().transform_regression('startTimeSleepHours','wake')
-0.5818441564000654
stage_p_and_startTime_df['date'] = pd.to_datetime(stage_p_and_startTime_df['date'] )
weather_df_selected['date'] = pd.to_datetime(weather_df_selected['date'] )
corr_temp_stages_df= weather_df_selected.merge(right=stage_p_and_startTime_df, on='date',how='inner' )
cols = list(corr_temp_stages_df.columns)
print(cols)
cols = [v for v in cols if v not in ['sum_glob_strahlung_Wh/m2','Unnamed: 0','Unnamed: 0.1','avg_temp20cm','avg_temp200cm','order']]
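# Restrict to numeric columns: recent pandas versions raise on non-numeric columns in corr()
corr_temp_stages_df[cols].select_dtypes('number').corr().style.background_gradient(cmap='coolwarm')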
['index', 'Unnamed: 0', 'Unnamed: 0.1', 'Tag', 'Stunde', 'avg_rel_Humidpercent', 'avg_temp20cm', 'avg_temp200cm', 'avg_boden_temp5cm', 'avg_boden_temp20cm', 'avg_wind_velocity2_5meter', 'sum_glob_strahlung_Wh/m2', 'niederschlag_nn', 'dt_x', 'date', 'dt_y', 'duration', 'hours', 'stage', 'wake', 'rem', 'light', 'deep', 'order', 'startTimeWake', 'startTimeSleep', 'startTimeHours', 'startTimeSleepHours', 'endTimeHours']
| | index | avg_rel_Humidpercent | avg_boden_temp5cm | avg_boden_temp20cm | avg_wind_velocity2_5meter | niederschlag_nn | duration | hours | wake | rem | light | deep | startTimeHours | startTimeSleepHours | endTimeHours |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| index | 1.000000 | 0.513817 | -0.882033 | -0.895784 | -0.032438 | 0.088037 | 0.167196 | 0.016957 | 0.356931 | 0.093791 | -0.086619 | 0.097360 | -0.501362 | -0.465857 | -0.471238 |
| avg_rel_Humidpercent | 0.513817 | 1.000000 | -0.584061 | -0.594692 | -0.397967 | 0.026214 | -0.138523 | -0.014049 | 0.134161 | -0.099660 | -0.102364 | -0.004471 | -0.316119 | -0.336832 | -0.469919 |
| avg_boden_temp5cm | -0.882033 | -0.584061 | 1.000000 | 0.992435 | 0.207793 | 0.023000 | -0.075899 | -0.007697 | -0.239295 | -0.036348 | 0.084703 | -0.093840 | 0.514882 | 0.474641 | 0.551809 |
| avg_boden_temp20cm | -0.895784 | -0.594692 | 0.992435 | 1.000000 | 0.188867 | -0.009637 | -0.067536 | -0.006849 | -0.252378 | -0.057406 | 0.110256 | -0.099637 | 0.496746 | 0.457655 | 0.536402 |
| avg_wind_velocity2_5meter | -0.032438 | -0.397967 | 0.207793 | 0.188867 | 1.000000 | 0.574999 | 0.107966 | 0.010950 | 0.056115 | -0.045451 | 0.144751 | -0.123512 | -0.028927 | -0.031227 | 0.042454 |
| niederschlag_nn | 0.088037 | 0.026214 | 0.023000 | -0.009637 | 0.574999 | 1.000000 | 0.036643 | 0.003716 | -0.046282 | 0.215166 | -0.044394 | -0.067343 | -0.072533 | -0.079501 | -0.059353 |
| duration | 0.167196 | -0.138523 | -0.075899 | -0.067536 | 0.107966 | 0.036643 | 1.000000 | 0.101417 | 0.638256 | -0.138149 | 0.595651 | 0.134661 | -0.464541 | -0.447940 | 0.125227 |
| hours | 0.016957 | -0.014049 | -0.007697 | -0.006849 | 0.010950 | 0.003716 | 0.101417 | 1.000000 | 0.064730 | -0.014011 | 0.060409 | 0.013657 | -0.047113 | -0.045429 | 0.012700 |
| wake | 0.356931 | 0.134161 | -0.239295 | -0.252378 | 0.056115 | -0.046282 | 0.638256 | 0.064730 | 1.000000 | -0.370498 | 0.396465 | -0.082317 | -0.610676 | -0.581844 | -0.274740 |
| rem | 0.093791 | -0.099660 | -0.036348 | -0.057406 | -0.045451 | 0.215166 | -0.138149 | -0.014011 | -0.370498 | 1.000000 | -0.752701 | 0.515303 | 0.122070 | 0.102455 | 0.084231 |
| light | -0.086619 | -0.102364 | 0.084703 | 0.110256 | 0.144751 | -0.044394 | 0.595651 | 0.060409 | 0.396465 | -0.752701 | 1.000000 | -0.574646 | -0.215444 | -0.203396 | 0.122912 |
| deep | 0.097360 | -0.004471 | -0.093840 | -0.099637 | -0.123512 | -0.067343 | 0.134661 | 0.013657 | -0.082317 | 0.515303 | -0.574646 | 1.000000 | -0.017965 | -0.013591 | 0.069056 |
| startTimeHours | -0.501362 | -0.316119 | 0.514882 | 0.496746 | -0.028927 | -0.072533 | -0.464541 | -0.047113 | -0.610676 | 0.122070 | -0.215444 | -0.017965 | 1.000000 | 0.985307 | 0.799889 |
| startTimeSleepHours | -0.465857 | -0.336832 | 0.474641 | 0.457655 | -0.031227 | -0.079501 | -0.447940 | -0.045429 | -0.581844 | 0.102455 | -0.203396 | -0.013591 | 0.985307 | 1.000000 | 0.791977 |
| endTimeHours | -0.471238 | -0.469919 | 0.551809 | 0.536402 | 0.042454 | -0.059353 | 0.125227 | 0.012700 | -0.274740 | 0.084231 | 0.122912 | 0.069056 | 0.799889 | 0.791977 | 1.000000 |
print('c =',get_corr(corr_temp_stages_df['startTimeHours'],corr_temp_stages_df['avg_temp200cm']))
c1 = alt.Chart(corr_temp_stages_df[['startTimeHours','avg_temp200cm']],title='Correlation between avg_temp200cm and startTimeHours').mark_point(size=15,color=C.light).encode(x=alt.X('startTimeHours',scale=alt.Scale(domain=[21,28])),y='avg_temp200cm')
corr_temp_startTime = c1 + c1.mark_line().transform_regression('startTimeHours','avg_temp200cm')
corr_temp_startTime
c = 0.5168992733769734
df_sunrise_set
| | dt | date | sunrise | sunset |
|---|---|---|---|---|
| 0 | 2021-01-01 09:00:00 | 2021-01-01 | 9 | 15 |
| 1 | 2021-02-01 08:00:00 | 2021-02-01 | 8 | 16 |
| 2 | 2021-03-01 07:00:00 | 2021-03-01 | 7 | 17 |
| 3 | 2021-04-01 06:00:00 | 2021-04-01 | 6 | 18 |
| 4 | 2021-05-01 05:00:00 | 2021-05-01 | 5 | 18 |
| ... | ... | ... | ... | ... |
| 361 | 2021-07-31 05:00:00 | 2021-07-31 | 5 | 19 |
| 362 | 2021-08-31 06:00:00 | 2021-08-31 | 6 | 18 |
| 363 | 2020-10-31 07:00:00 | 2020-10-31 | 7 | 16 |
| 364 | 2021-10-31 07:00:00 | 2021-10-31 | 7 | 10 |
| 365 | 2020-12-31 09:00:00 | 2020-12-31 | 9 | 15 |
366 rows × 4 columns
# Sunrise / sunset:
# no strong effect; sunrise vs. wake duration only correlates at about 0.3
df_sunrise_set['date'] = pd.to_datetime(df_sunrise_set['date'] )
corr_temp_stages_df= corr_temp_stages_df.merge(right=df_sunrise_set,on='date',how='left')
cols = list(corr_temp_stages_df.columns)
print(cols)
cols = [v for v in cols if v not in ['sum_glob_strahlung_Wh/m2','Unnamed: 0','Unnamed: 0.1','avg_temp20cm','avg_temp200cm','order']]
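# Restrict to numeric columns: recent pandas versions raise on non-numeric columns in corr()
corr_temp_stages_df[cols].select_dtypes('number').corr().style.background_gradient(cmap='coolwarm')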
['index', 'Unnamed: 0', 'Unnamed: 0.1', 'Tag', 'Stunde', 'avg_rel_Humidpercent', 'avg_temp20cm', 'avg_temp200cm', 'avg_boden_temp5cm', 'avg_boden_temp20cm', 'avg_wind_velocity2_5meter', 'sum_glob_strahlung_Wh/m2', 'niederschlag_nn', 'dt_x', 'date', 'dt_y', 'duration', 'hours', 'stage', 'wake', 'rem', 'light', 'deep', 'order', 'startTimeWake', 'startTimeSleep', 'startTimeHours', 'startTimeSleepHours', 'endTimeHours', 'dt', 'sunrise', 'sunset']
| | index | avg_rel_Humidpercent | avg_boden_temp5cm | avg_boden_temp20cm | avg_wind_velocity2_5meter | niederschlag_nn | duration | hours | wake | rem | light | deep | startTimeHours | startTimeSleepHours | endTimeHours | sunrise | sunset |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| index | 1.000000 | 0.513817 | -0.882033 | -0.895784 | -0.032438 | 0.088037 | 0.167196 | 0.016957 | 0.356931 | 0.093791 | -0.086619 | 0.097360 | -0.501362 | -0.465857 | -0.471238 | 0.912802 | -0.915639 |
| avg_rel_Humidpercent | 0.513817 | 1.000000 | -0.584061 | -0.594692 | -0.397967 | 0.026214 | -0.138523 | -0.014049 | 0.134161 | -0.099660 | -0.102364 | -0.004471 | -0.316119 | -0.336832 | -0.469919 | 0.470025 | -0.483513 |
| avg_boden_temp5cm | -0.882033 | -0.584061 | 1.000000 | 0.992435 | 0.207793 | 0.023000 | -0.075899 | -0.007697 | -0.239295 | -0.036348 | 0.084703 | -0.093840 | 0.514882 | 0.474641 | 0.551809 | -0.873933 | 0.864540 |
| avg_boden_temp20cm | -0.895784 | -0.594692 | 0.992435 | 1.000000 | 0.188867 | -0.009637 | -0.067536 | -0.006849 | -0.252378 | -0.057406 | 0.110256 | -0.099637 | 0.496746 | 0.457655 | 0.536402 | -0.881407 | 0.888123 |
| avg_wind_velocity2_5meter | -0.032438 | -0.397967 | 0.207793 | 0.188867 | 1.000000 | 0.574999 | 0.107966 | 0.010950 | 0.056115 | -0.045451 | 0.144751 | -0.123512 | -0.028927 | -0.031227 | 0.042454 | -0.062947 | -0.061704 |
| niederschlag_nn | 0.088037 | 0.026214 | 0.023000 | -0.009637 | 0.574999 | 1.000000 | 0.036643 | 0.003716 | -0.046282 | 0.215166 | -0.044394 | -0.067343 | -0.072533 | -0.079501 | -0.059353 | -0.017867 | -0.100709 |
| duration | 0.167196 | -0.138523 | -0.075899 | -0.067536 | 0.107966 | 0.036643 | 1.000000 | 0.101417 | 0.638256 | -0.138149 | 0.595651 | 0.134661 | -0.464541 | -0.447940 | 0.125227 | 0.137728 | -0.142209 |
| hours | 0.016957 | -0.014049 | -0.007697 | -0.006849 | 0.010950 | 0.003716 | 0.101417 | 1.000000 | 0.064730 | -0.014011 | 0.060409 | 0.013657 | -0.047113 | -0.045429 | 0.012700 | 0.013968 | -0.014422 |
| wake | 0.356931 | 0.134161 | -0.239295 | -0.252378 | 0.056115 | -0.046282 | 0.638256 | 0.064730 | 1.000000 | -0.370498 | 0.396465 | -0.082317 | -0.610676 | -0.581844 | -0.274740 | 0.308037 | -0.270777 |
| rem | 0.093791 | -0.099660 | -0.036348 | -0.057406 | -0.045451 | 0.215166 | -0.138149 | -0.014011 | -0.370498 | 1.000000 | -0.752701 | 0.515303 | 0.122070 | 0.102455 | 0.084231 | 0.031733 | -0.092315 |
| light | -0.086619 | -0.102364 | 0.084703 | 0.110256 | 0.144751 | -0.044394 | 0.595651 | 0.060409 | 0.396465 | -0.752701 | 1.000000 | -0.574646 | -0.215444 | -0.203396 | 0.122912 | -0.076387 | 0.079233 |
| deep | 0.097360 | -0.004471 | -0.093840 | -0.099637 | -0.123512 | -0.067343 | 0.134661 | 0.013657 | -0.082317 | 0.515303 | -0.574646 | 1.000000 | -0.017965 | -0.013591 | 0.069056 | 0.139317 | -0.107043 |
| startTimeHours | -0.501362 | -0.316119 | 0.514882 | 0.496746 | -0.028927 | -0.072533 | -0.464541 | -0.047113 | -0.610676 | 0.122070 | -0.215444 | -0.017965 | 1.000000 | 0.985307 | 0.799889 | -0.403367 | 0.522529 |
| startTimeSleepHours | -0.465857 | -0.336832 | 0.474641 | 0.457655 | -0.031227 | -0.079501 | -0.447940 | -0.045429 | -0.581844 | 0.102455 | -0.203396 | -0.013591 | 0.985307 | 1.000000 | 0.791977 | -0.360205 | 0.485864 |
| endTimeHours | -0.471238 | -0.469919 | 0.551809 | 0.536402 | 0.042454 | -0.059353 | 0.125227 | 0.012700 | -0.274740 | 0.084231 | 0.122912 | 0.069056 | 0.799889 | 0.791977 | 1.000000 | -0.376857 | 0.513833 |
| sunrise | 0.912802 | 0.470025 | -0.873933 | -0.881407 | -0.062947 | -0.017867 | 0.137728 | 0.013968 | 0.308037 | 0.031733 | -0.076387 | 0.139317 | -0.403367 | -0.360205 | -0.376857 | 1.000000 | -0.827224 |
| sunset | -0.915639 | -0.483513 | 0.864540 | 0.888123 | -0.061704 | -0.100709 | -0.142209 | -0.014422 | -0.270777 | -0.092315 | 0.079233 | -0.107043 | 0.522529 | 0.485864 | 0.513833 | -0.827224 | 1.000000 |
print('c =',get_corr(corr_temp_stages_df['startTimeHours'],corr_temp_stages_df['sunset']))
c1 = alt.Chart(corr_temp_stages_df[['startTimeHours','sunset']],title='Correlation between sunset and startTimeHours').mark_point(size=15,color=C.light).encode(x=alt.X('startTimeHours',scale=alt.Scale(domain=[21,28])),y='sunset')
c1 + c1.mark_line().transform_regression('startTimeHours','sunset')
c = 0.522528554134968
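The sunset/bedtime correlation may partly reflect a shared trend over the recording period: in the matrix, `index` correlates strongly with `sunset` (-0.92) and moderately with `startTimeHours` (-0.50). Assuming `index` is a reasonable proxy for position in the season (that is my assumption, the column is just the row index of the weather data), a first-order partial correlation gives a rough idea of how much is left once that trend is held fixed. `partial_corr` is a hypothetical helper implementing the standard formula.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Values copied from the correlation matrix above.
r_bed_sunset = 0.522529     # startTimeHours vs sunset
r_bed_index = -0.501362     # startTimeHours vs index
r_sunset_index = -0.915639  # sunset vs index

# Bedtime vs sunset with the time trend (index) held fixed.
r_partial = partial_corr(r_bed_sunset, r_bed_index, r_sunset_index)
```

With these inputs the partial correlation drops to roughly 0.18, which suggests that most of the raw 0.52 could be explained by both variables drifting over the same weeks (for example the end of the vacation in late summer) and not by sunset itself.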